CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection
Robust face detection in the wild is an essential component in support of
various face-related tasks, e.g. unconstrained face recognition, facial
periocular recognition, facial landmarking and pose estimation, facial
expression recognition, 3D facial model construction, etc. Although the face
detection problem has been studied intensively for decades and underpins
various commercial applications, it still struggles in some real-world
scenarios due to numerous challenges, e.g. heavy facial occlusions, extremely
low resolutions, strong illumination changes, extreme pose variations, image or
video compression artifacts, etc. In this paper, we present a face detection approach
named Contextual Multi-Scale Region-based Convolution Neural Network (CMS-RCNN)
to robustly solve the problems mentioned above. Similar to other region-based
CNNs, our proposed network consists of a region proposal component and a
region-of-interest (RoI) detection component. Unlike those networks, however,
our proposed network makes two main contributions that enable
state-of-the-art performance in face detection.
Firstly, multi-scale information is aggregated in both the region proposal and RoI
detection stages to deal with tiny face regions. Secondly, our proposed network allows
explicit body contextual reasoning in the network, inspired by the human vision
system. The proposed approach is benchmarked on two recent challenging face
detection databases: the WIDER FACE Dataset, which contains a high degree of
variability, and the Face Detection Dataset and Benchmark (FDDB). The
experimental results show that our proposed approach, trained on the WIDER FACE
Dataset, outperforms strong baselines on the WIDER FACE Dataset by a large
margin, and consistently achieves competitive results on FDDB against recent
state-of-the-art face detection methods.
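The multi-scale grouping described above can be illustrated with a minimal sketch: features RoI-pooled from several convolutional stages are L2-normalised, so that no single layer dominates, and then concatenated into one descriptor. The layer shapes below are toy assumptions for illustration, not the paper's architecture.

```python
import numpy as np

def fuse_multiscale_roi_features(feature_maps, eps=1e-12):
    """L2-normalise each RoI-pooled feature map, then concatenate them
    into a single descriptor (illustrative sketch)."""
    parts = []
    for fmap in feature_maps:
        v = fmap.ravel().astype(np.float64)
        parts.append(v / (np.linalg.norm(v) + eps))
    return np.concatenate(parts)

rng = np.random.default_rng(0)
# toy stand-ins for RoI-pooled outputs of three conv stages
f3 = rng.random((4, 4, 8))
f4 = rng.random((4, 4, 16))
f5 = rng.random((4, 4, 32))
desc = fuse_multiscale_roi_features([f3, f4, f5])
print(desc.shape)  # (896,)
```

The per-layer normalisation is what makes features of very different magnitudes from shallow and deep layers comparable before a shared detection head sees them.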
Towards a reliable face recognition system
Face Recognition (FR) is an important area in computer vision with many applications, such as security and automated border control. Recent advancements in this domain have pushed model performance to human-level accuracy. However, varying real-world conditions pose further challenges to their adoption. In this paper, we investigate the performance of these models by analyzing a cross-section of face detection and recognition models. Experiments were carried out without any preprocessing on three state-of-the-art face detection methods, namely HOG, YOLO and MTCNN, and three recognition models, namely VGGFace2, FaceNet and ArcFace. Our results indicate that these methods rely significantly on preprocessing for optimum performance.
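One typical preprocessing step the abstract alludes to is face alignment from detected eye landmarks: a similarity transform maps the two eye centres onto canonical positions before the crop is fed to the recognition model. A minimal sketch follows; the canonical eye layout (crop size 112, eyes at 30%/70% width) is an assumed convention for illustration, not taken from the paper.

```python
import numpy as np

def eye_alignment_transform(left_eye, right_eye, out_size=112, eye_y=0.35):
    """2x3 similarity transform (rotation + scale + translation) mapping the
    detected eye centres onto assumed canonical positions."""
    src_l = np.asarray(left_eye, dtype=float)
    src_r = np.asarray(right_eye, dtype=float)
    dst_l = np.array([0.30 * out_size, eye_y * out_size])
    dst_r = np.array([0.70 * out_size, eye_y * out_size])
    d_src = src_r - src_l
    d_dst = dst_r - dst_l
    scale = np.linalg.norm(d_dst) / np.linalg.norm(d_src)
    angle = np.arctan2(d_src[1], d_src[0])  # tilt of the detected eye axis
    c, s = np.cos(-angle), np.sin(-angle)   # rotate that axis to horizontal
    R = scale * np.array([[c, -s], [s, c]])
    t = dst_l - R @ src_l
    return np.hstack([R, t[:, None]])  # 2x3 affine matrix

M = eye_alignment_transform((40.0, 60.0), (80.0, 50.0))
# applying M to the right eye lands it on its canonical position
mapped = M[:, :2] @ np.array([80.0, 50.0]) + M[:, 2]
print(np.round(mapped, 3))
```

Skipping exactly this kind of normalisation is what the experiments above test, by feeding raw detector crops straight into the embedding networks.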
Multi-view Face Detection Using Deep Convolutional Neural Networks
In this paper we consider the problem of multi-view face detection. While
there has been significant research on this problem, current state-of-the-art
approaches for this task require annotation of facial landmarks, e.g. TSM [25],
or annotation of face poses [28, 22]. They also require training dozens of
models to fully capture faces in all orientations, e.g. 22 models in HeadHunter
method [22]. In this paper we propose Deep Dense Face Detector (DDFD), a method
that does not require pose/landmark annotation and is able to detect faces in a
wide range of orientations using a single model based on deep convolutional
neural networks. The proposed method has minimal complexity; unlike other
recent deep learning object detection methods [9], it does not require
additional components such as segmentation, bounding-box regression, or SVM
classifiers. Furthermore, we analyzed the scores of the proposed face detector for
faces in different orientations and found that 1) the proposed method is able
to detect faces from different angles and can handle occlusion to some extent,
and 2) there appears to be a correlation between the distribution of positive
examples in the training set and the scores of the proposed face detector. The
latter suggests that the proposed method's performance can be further improved
by using better sampling strategies and more sophisticated data augmentation techniques.
Evaluations on popular face detection benchmark datasets show that our
single-model face detector algorithm has similar or better performance compared
to the previous methods, which are more complex and require annotations of
either different poses or facial landmarks. Comment: in International Conference on Multimedia Retrieval 2015 (ICMR
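A single-model detector such as DDFD still produces many overlapping candidate boxes per face and relies on non-maximum suppression to merge them. A minimal greedy NMS sketch (box format and threshold are illustrative, not the paper's exact settings):

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.3):
    """Greedy non-maximum suppression over (x1, y1, x2, y2) boxes.
    Returns indices of kept boxes, highest score first."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # intersection of the current best box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thr]  # drop boxes overlapping the kept one
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
kept = nms(boxes, np.array([0.9, 0.8, 0.7]))
print(kept)  # [0, 2]
```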
Stacking-fault energies for Ag, Cu, and Ni from empirical tight-binding potentials
The intrinsic stacking-fault energies and free energies for Ag, Cu, and Ni
are derived from molecular-dynamics simulations using the empirical
tight-binding potentials of Cleri and Rosato [Phys. Rev. B 48, 22 (1993)].
While the results show significant deviations from experimental data, the
general trend across the elements remains correct. This allows the potentials
to be used for qualitative comparisons between metals with high and low
stacking-fault energies. Moreover, the effect of stacking faults on the local
vibrational properties near the fault is examined. It turns out that the
stacking fault has the strongest effect on modes in the center of the
transverse peak, and its effect is localized in a region of approximately eight
monolayers around the defect. Comment: 5 pages, 2 figures, accepted for publication in Phys. Rev.
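The quantity derived in this abstract is the excess energy of a faulted supercell per unit fault area. A minimal sketch with the standard eV/Å² to mJ/m² conversion (the energies and area below are toy values, not the paper's results):

```python
# gamma_SF = (E_fault - E_perfect) / A_fault
# conversion: 1 eV/Angstrom^2 = 1.602e-19 J / 1e-20 m^2, reported in mJ/m^2
EV_PER_A2_TO_MJ_PER_M2 = 1.602176634e-19 / 1e-20 * 1e3

def stacking_fault_energy(e_fault_eV, e_perfect_eV, area_A2):
    """Excess energy of the faulted cell per unit fault area, in mJ/m^2."""
    return (e_fault_eV - e_perfect_eV) / area_A2 * EV_PER_A2_TO_MJ_PER_M2

# toy supercell energies (eV) and fault area (Angstrom^2)
gamma = stacking_fault_energy(-1399.72, -1399.75, 120.0)
print(round(gamma, 1))  # ~4.0 mJ/m^2 for these toy numbers
```

In practice both total energies come from relaxed molecular-dynamics or statics runs of the same supercell with and without the intrinsic fault.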
Image Co-localization by Mimicking a Good Detector's Confidence Score Distribution
Given a set of images containing objects from the same category, the task of
image co-localization is to identify and localize each instance. This paper
shows that this problem can be solved by a simple but intriguing idea, that is,
a common object detector can be learnt by making its detection confidence
scores distributed like those of a strongly supervised detector. More
specifically, we observe that given a set of object proposals extracted from an
image that contains the object of interest, an accurate strongly supervised
object detector should give high scores to only a small minority of proposals,
and low scores to most of them. Thus, we devise an entropy-based objective
function to enforce the above property when learning the common object
detector. Once the detector is learnt, we resort to a segmentation approach to
refine the localization. We show that despite its simplicity, our approach
outperforms state-of-the-art methods. Comment: Accepted to Proc. European Conf. Computer Vision 201
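The entropy-based idea can be illustrated directly: the softmax entropy over proposal scores is low exactly when the detector concentrates confidence on a few proposals, which is the property the objective enforces. A minimal sketch of the measured quantity, not the paper's full objective:

```python
import numpy as np

def score_entropy(scores):
    """Shannon entropy of the softmax over proposal scores; a strongly
    supervised detector yields a peaked, low-entropy distribution."""
    z = np.asarray(scores, dtype=float)
    p = np.exp(z - z.max())
    p /= p.sum()
    return float(-(p * np.log(p + 1e-12)).sum())

uniform = score_entropy(np.zeros(100))             # every proposal equally likely
peaked = score_entropy(np.r_[10.0, np.zeros(99)])  # one confident detection
print(uniform > peaked)  # True
```

Minimising this entropy during learning pushes the common detector toward giving high scores to only a small minority of proposals, as the abstract describes.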
Development of a tight-binding potential for bcc-Zr. Application to the study of vibrational properties
We present a tight-binding potential based on the moment expansion of the
density of states, which includes up to the fifth moment. The potential is
fitted to bcc and hcp Zr and it is applied to the computation of vibrational
properties of bcc-Zr. In particular, we compute the isothermal elastic
constants in the temperature range 1200K < T < 2000K by means of standard Monte
Carlo simulation techniques. The agreement with experimental results is
satisfactory, especially in the case of the stability of the lattice with
respect to the shear associated with C'. However, the temperature decrease of
the Cauchy pressure is not reproduced. The T=0K phonon frequencies of bcc-Zr
are also computed. The potential predicts several instabilities of the bcc
structure, and a crossing of the longitudinal and transverse modes in the (001)
direction. This is in agreement with recent ab initio calculations in Sc, Ti,
Hf, and La. Comment: 14 pages, 6 tables, 4 figures, revtex; the kinetic term of the
isothermal elastic constants has been corrected (Eq. (4.1), Table VI and
Figure 4).
A novel infrared video surveillance system using deep learning based techniques
This is the author accepted manuscript. The final version is available from Springer via the DOI in this record.
This paper presents a new, practical infrared video based surveillance
system, consisting of a resolution-enhanced, automatic target detection/recognition
(ATD/R) system that is widely applicable in civilian and military applications. To
deal with the issue of the small number of pixels on target in the developed ATD/R
system, as encountered in long-range imagery, a super-resolution method is
employed to increase target signature resolution and optimise the baseline quality
of inputs for object recognition. To tackle the challenge of detecting extremely
low-resolution targets, we train a sophisticated and powerful convolutional neural
network (CNN) based faster-RCNN using long wave infrared imagery datasets
that were prepared and marked in-house. The system was tested under different
weather conditions, using two datasets featuring target types comprising pedestrians
and 6 different types of ground vehicles. The developed ATD/R system can
detect extremely low-resolution targets with superior performance by effectively
addressing the small number of pixels on target encountered in long-range
applications. A comparison with traditional methods confirms this superiority
both qualitatively and quantitatively.
This work was funded by Thales UK, the Centre of Excellence for
Sensor and Imaging System (CENSIS), and the Scottish Funding Council under the project
“AALART. Thales-Challenge Low-pixel Automatic Target Detection and Recognition (ATD/ATR)”,
ref. CAF-0036. Thanks are also given to the Digital Health and Care Institute (DHI, project
Smartcough-MacMasters), which partially supported Mr. Monge-Alvarez’s contribution, and
to the Royal Society of Edinburgh and National Science Foundation of China for the funding
associated with the project “Flood Detection and Monitoring using Hyperspectral Remote Sensing
from Unmanned Aerial Vehicles”, which partially covered Dr. Casaseca-de-la-Higuera’s,
Dr. Luo’s, and Prof. Wang’s contribution. Dr. Casaseca-de-la-Higuera would also like to acknowledge
the Royal Society of Edinburgh for the funding associated with project “HIVE”.
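The pipeline shape of the ATD/R system above — enlarge the low-resolution input before handing it to the detector — can be sketched minimally. The paper employs a proper super-resolution method; nearest-neighbour upscaling below is only a crude stand-in to show where that stage sits:

```python
import numpy as np

def upscale_nearest(img, factor=2):
    """Nearest-neighbour upscaling: a crude placeholder for the
    super-resolution stage that enlarges small-target signatures
    before detection."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

thermal = np.arange(16, dtype=float).reshape(4, 4)  # toy long-wave IR patch
hires = upscale_nearest(thermal, factor=4)
print(hires.shape)  # (16, 16)
```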
Probabilistic Computation in Human Perception under Variability in Encoding Precision
A key function of the brain is to interpret noisy sensory information. To do so optimally, observers must, in many tasks, take into account knowledge of the precision with which stimuli are encoded. In an orientation change detection task, we find that encoding precision not only depends on an experimentally controlled reliability parameter (shape), but also exhibits additional variability. In spite of this variability in precision, human subjects seem to take precision into account near-optimally on a trial-to-trial and item-to-item basis. Our results offer a new conceptualization of the encoding of sensory information and highlight the brain's remarkable ability to incorporate knowledge of uncertainty during complex perceptual decision-making.
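What "taking precision into account" means can be sketched with a simple illustrative rule, not the authors' full Bayesian model: an observer who knows each item's encoding noise should weight an identical measured change more heavily when the item was encoded precisely (small sigma).

```python
import numpy as np

def precision_weighted_evidence(deltas, sigmas):
    """Per-item evidence for a change: the squared measured difference
    weighted by encoding precision 1/sigma^2 (an assumed illustrative
    rule, not the paper's model)."""
    deltas = np.asarray(deltas, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    return deltas ** 2 / (2.0 * sigmas ** 2)

# the same 5-degree measured change counts for more when the item
# was encoded with high precision (sigma = 1) than low (sigma = 4)
ev = precision_weighted_evidence([5.0, 5.0], [1.0, 4.0])
print(ev[0] > ev[1])  # True
```

An observer who ignored precision would treat both items identically, which is the suboptimal behaviour the experiments rule out.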